# Low-resource language processing

Mbart50 Saraiki News Summarization

A Saraiki news summarization model fine-tuned from the multilingual mBART-50 model, capable of generating concise summaries of Saraiki news content.

Text Generation

Transformers Other

Aidman Wav2vec2 Large Xls R 300m Irish Colab

An Irish speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice dataset.

Speech Recognition

The Camel Model is a text generation model based on the transformer architecture, supporting Azerbaijani and trained using reinforcement learning.

Large Language Model

Transformers Other

Whisper Fleurs Small Te In

This model is a fine-tuned version of OpenAI's Whisper Small on the FLEURS dataset, focusing on speech recognition tasks and supporting Telugu (te).

Speech Recognition

Transformers Other

Mt5 Sinhala News Finetunedv3

A text summarization model fine-tuned on Sinhala news data based on Google's mT5-small model

Text Generation

Transformers Other

XLM-RoBERTa-large fine-tuned Uzbek named entity recognition model supporting 21 entity types

Sequence Labeling

Transformers Other

Whisper Base Pl

A speech recognition model fine-tuned on the Polish Common Voice 17.0 dataset based on OpenAI Whisper-base

Speech Recognition

Transformers Other

Shark Finetuned Kde4 Ar En

Arabic-to-English translation model fine-tuned on the kde4 dataset based on Helsinki-NLP/opus-mt-ar-en

Machine Translation

Romaneng2nep V3

This model is fine-tuned from google/mt5-small for converting Romanized Nepali to Nepali text

Machine Translation

Transformers Supports Multiple Languages

Mms Tts Div Finetuned Md F02

This is a Transformer-based speech model supporting Dhivehi (Maldivian) speech processing tasks.

Large Language Model

Transformers Other

Mt5 XLSUM Ua News

A headline generation model fine-tuned on Ukrainian news datasets based on the multilingual mT5 model, capable of generating concise and accurate headlines for Ukrainian news articles.

Text Generation

Transformers Other

Whisper Sinhala Audio To Text

A Sinhala speech recognition model fine-tuned from openai/whisper-small, converting Sinhala speech to text.

Speech Recognition

Whisper Small Kyrgyz

Kyrgyz automatic speech recognition (ASR) model based on the Whisper architecture, developed with support from the National Commission on Language and Language Policy under the President of the Kyrgyz Republic

Speech Recognition

Transformers Other

Kubert Central Kurdish BERT Model

KuBERT is a Central Kurdish language model based on the BERT framework, designed to address the scarcity of Kurdish language resources and enhance computational linguistics capabilities.

Large Language Model

Mt5 Small Amharic Text Summaization

A fine-tuned Amharic text summarization model based on google/mt5-small, suitable for news article headline generation tasks.

Text Generation

Mmlw Roberta Base

A Polish sentence embedding model based on RoBERTa architecture, focusing on sentence similarity calculation and feature extraction tasks.

Transformers Other

Nllb Clip Base Siglip

NLLB-CLIP-SigLIP is a multilingual vision-language model that combines the text encoder from NLLB and the image encoder from SigLIP, supporting 201 languages.

M2m100 1.2B Ft Ru Kbd 63K

A translation model fine-tuned on Russian-Kabardian datasets based on facebook/m2m100_1.2B

Machine Translation

Transformers Other

Sinhala Roberta Sentence Transformer

This is a sentence-transformers based model for mapping Sinhala sentences into a 768-dimensional vector space, supporting tasks like sentence similarity calculation and semantic search.
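
Sentence similarity over embeddings like these is typically scored with cosine similarity. The sketch below shows only that scoring step on toy vectors; the `cosine_similarity` helper and the 3-dimensional inputs are illustrative assumptions, not part of the model (which would produce 768-dimensional vectors).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy stand-ins for sentence embeddings (hypothetical values).
query = [1.0, 0.0, 0.0]
same_direction = [2.0, 0.0, 0.0]
orthogonal = [0.0, 3.0, 0.0]

print(cosine_similarity(query, same_direction))
print(cosine_similarity(query, orthogonal))
```

For semantic search, the same score would be computed between a query embedding and each candidate embedding, and the candidates ranked by it.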

MLEAFIT Es2ptt5

A Spanish-to-Portuguese translation model based on the T5-small architecture and fine-tuned on the tatoeba dataset, with a reported BLEU score of 11.2994.

Machine Translation

Bodo Roberta Base

A RoBERTa-base model configuration for the Bodo language, including a byte-level BPE tokenizer for Bodo.

Large Language Model

Whisper Small Haitian

This model is a fine-tuned version of whisper-small-cv11-french, optimized for Haitian Creole speech recognition

Speech Recognition

Bert Restore Punctuation Turkish

This is a Transformer model for Turkish text punctuation restoration, capable of predicting the correct positions of periods (.), commas (,), and question marks (?).

Sequence Labeling

Transformers Other
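
Punctuation restoration tagged as sequence labeling usually means each word receives a label naming the mark, if any, that should follow it. The sketch below illustrates only that final decoding step. The label names (`O`, `PERIOD`, `COMMA`, `QUESTION`) and the `apply_punctuation` helper are hypothetical and are not taken from the model itself.

```python
# Hypothetical label set; the real model's labels may differ.
LABEL_TO_MARK = {"O": "", "PERIOD": ".", "COMMA": ",", "QUESTION": "?"}

def apply_punctuation(words, labels):
    """Append the predicted mark (if any) to each word and rejoin the text."""
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")
    return " ".join(word + LABEL_TO_MARK[label] for word, label in zip(words, labels))

# Example per-word predictions, written by hand for illustration.
words = ["merhaba", "nasılsın"]
labels = ["COMMA", "QUESTION"]
print(apply_punctuation(words, labels))
```

In practice the labels would come from a token-classification model's output, aligned back from subword tokens to whole words before this step.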

Glot500 is a multilingual pre-trained model that supports over 500 languages and is trained with the masked language modeling (MLM) objective.

Large Language Model

Tags Allnli GroNLP Bert Base Dutch Cased

Dutch BERT-based sentence embedding model that maps text to a 768-dimensional vector space, suitable for semantic similarity calculation and text classification tasks

Transformers Other

Mt5 Small HunSum 1

Hungarian abstractive summarization model based on the mT5-small architecture and trained on the HunSum-1 dataset

Text Generation

Transformers Other

Whisper Small Yoruba

This model is a fine-tuned version of openai/whisper-small on the google/fleurs yo_ng dataset, designed for automatic speech recognition tasks in Yoruba.

Speech Recognition

Whisper Small Sk Cv11

Slovak speech recognition model fine-tuned from OpenAI Whisper-small on the Common Voice 11.0 Slovak dataset

Speech Recognition

Transformers Other

Slovakbert Skquad

A question answering model based on SlovakBERT, fine-tuned on the Slovak-language dataset skquad

Question Answering System

Transformers Other

TUKE-DeutscheTelekom

A language model pretrained on large-scale Bulgarian and Macedonian texts, part of the MaCoCu project

Large Language Model Other

Estonian feature extraction model fine-tuned from the XLM-RoBERTa base model

Transformers Other

A multilingual processing model supporting over 100 languages and writing systems, covering major global language families and dialect variants

Large Language Model

Transformers Supports Multiple Languages

Marian Finetuned Kde4 En To Ar

This model is an English-to-Arabic translation model fine-tuned on the kde4 dataset based on Helsinki-NLP/opus-mt-en-ar.

Machine Translation

Mbart Finetuned Fa

A generative summarization model fine-tuned on Persian summarization datasets based on MBART-large-50

Text Generation

Transformers Other

Mt5 Base Finetuned Fa

A Persian summarization model fine-tuned on pn_summary dataset based on google/mt5-base

Text Generation

Transformers Other

Mt5 Multilingual XLSum Finetuned Fa Finetuned Ar

A multilingual summarization model based on mT5, specifically fine-tuned for Arabic on the XLSum dataset

Text Generation

Transformers Arabic

Mt5 Base Finetuned Urdu

A summarization model fine-tuned from google/mt5-base on the Urdu xlsum dataset

Text Generation

Transformers Other

Wav2vec2 Large Xls R 300m Urdu Cv8 200epochs

Urdu speech recognition model trained on Common Voice dataset, using wav2vec 2.0 architecture

Speech Recognition

Arabart Finetuned Ar

A text summarization model based on AraBART, fine-tuned on the Arabic summarization dataset xlsum

Text Generation

Fullstop Catalan Punctuation Prediction

This model is used to predict punctuation marks in Catalan, capable of restoring periods, commas, question marks, hyphens, and colons.

Sequence Labeling

Transformers Other
